Abstract: The FOCal Underdetermined System Solver (FOCUSS) is a powerful and extremely easy-to-implement tool for sparse representation and underdetermined inverse problems. In this paper, we provide a comprehensive convergence analysis of the FOCUSS algorithm, towards establishing a systematic convergence theory for it. First, we give a rigorous derivation of the algorithm by exploiting an auxiliary function. Then, we prove its convergence. In particular, we systematically analyze its convergence rate for different values of the sparsity parameter p and demonstrate the rate by numerical experiments.

Index Terms: FOCUSS algorithm, convergence, convergence rate, auxiliary function, compressive sensing, superlinear convergence, linear convergence, global convergence theorem



I. Introduction 

THE problem of finding sparse solutions to underdetermined linear problems from limited data arises in many applications, including compressive sensing/compressive sampling [1]-[4], biomagnetic imaging [5], spectral estimation, direction-of-arrival (DOA), signal reconstruction [6]-[8], and so on. Mathematically, this problem is to solve the following combinatorial optimization problem [9]-[12]:

$$\min_{s} \|s\|_0 \quad \text{subject to } x = As, \tag{1}$$

where $\|s\|_0$ denotes the number of nonzero components in $s$, $x = (x_1, \ldots, x_m)^T \in \mathbb{R}^m$ is an observed vector, $A = [a_1, \ldots, a_n] \in \mathbb{R}^{m \times n}$ is a known basis matrix ($m < n$), $s = (s_1, \ldots, s_n)^T \in \mathbb{R}^n$ is an unknown vector which represents $n$ sparse sources or hidden sparse components, and $m$ is the number of observations. The objective is to estimate the sources $s$ such that $s$ is as sparse as possible, in the sense that most components of $s$ are zero or approximately zero [5]-[9], [13]-[22].

In general, it is very difficult to directly solve the combinatorial problem (1) if its dimension is high. Measuring the sparsity by $\ell_p$ diversity, instead we usually consider its approximate optimization problem [5]-[9], [14], [17], [21]-[24]:

$$\min_{s} F(s) = \min_{s} \sum_{i=1}^{n} |s_i|^p \quad \text{subject to: } x = As, \tag{2}$$

where $0 < p < 2$. Much attention has been paid to this problem and many algorithms have been developed for it, especially for the special case $p = 1$ [9]-[11], [25], for example, linear programming (LP) [10], [16], [18], [20], [26], [27], basis pursuit (BP) [18], various greedy algorithms (e.g., shortest path decomposition [26], [28], $\ell_p$-BP with $0 < p < 1$ [29], MP and OMP [12], [30], [31], etc.), least squares methods with $\ell_1$ regularization (e.g., PDCO-LSQR [32], Homotopy [33], TNIPM [2], etc.) and FOCUSS algorithm(s) [7], [19], [34]-[36].

Among them, LP and BP are time-consuming. MP and OMP are fast but can only achieve an approximate/rough solution to (1), so their accuracy is usually worse than that of the others. $\ell_p$-BP [29] is NP-hard and requires a lot of storage space. So LP, BP and $\ell_p$-BP are not suitable for large scale problems. The least squares methods with $\ell_1$ regularization can potentially be used to solve large scale problems, but the regularization parameters that impose the sparseness constraint must be chosen subjectively in advance. Generally speaking, it is not easy to set the optimal sparseness regularization parameters properly. In contrast, the FOCUSS algorithms, developed originally by Gorodnitsky, Rao et al. [5]-[8], [14], [17], [21], are not only very efficient in finding a precise solution for the $\ell_p$-sparse representation problem (2) but also have no regularization parameters to set, and they are extremely easy to implement. Moreover, they are advantageous in terms of computational complexity, and they are suitable even for large scale problems [37].

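The standard FOCUSS algorithm can be stated as follows:

$$s^{(t+1)} = \Pi^{-1}(s^{(t)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x, \tag{3}$$

where $t = 0, 1, \ldots, +\infty$ and

$$\Pi(s^{(t)}) = \mathrm{diag}\big(|s_1^{(t)}|^{p-2}, \ldots, |s_n^{(t)}|^{p-2}\big).$$
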
Or, it can be equivalently implemented by three steps [7] as

$$W^{(t+1)} = \mathrm{diag}\big(|s_1^{(t)}|^{1-\frac{p}{2}}, \ldots, |s_n^{(t)}|^{1-\frac{p}{2}}\big),$$
$$q^{(t+1)} = [A^{(t+1)}]^{+} x, \quad \text{where } A^{(t+1)} = A W^{(t+1)},$$
$$s^{(t+1)} = W^{(t+1)} q^{(t+1)}.$$

For simplicity, the overall procedure for finding a sparse solution $s^{(*)}$ by FOCUSS can be denoted as

$$s^{(*)} = \mathrm{FOCUSS}(x, A, s^{(0)}, \text{num\_iter}),$$

where $s^{(0)}$ is an initialization and num_iter is the pre-specified number of iterations.
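
As a minimal sketch of the procedure above (not the authors' code), iteration (3) could be written with NumPy as follows. The function name `focuss`, the random test problem and the chosen values of `m`, `n`, `p` and `num_iter` are assumptions made for illustration only; the linear system is solved directly rather than forming an explicit inverse.

```python
import numpy as np


def focuss(x, A, s0, num_iter, p=1.0):
    """Run num_iter iterations of (3) from an entrywise nonzero s0 (sketch)."""
    s = np.asarray(s0, dtype=float).copy()
    for _ in range(num_iter):
        w = np.abs(s) ** (2.0 - p)             # diagonal of Pi^{-1}(s^(t))
        M = (A * w) @ A.T                      # A Pi^{-1}(s^(t)) A^T, an m x m matrix
        s = w * (A.T @ np.linalg.solve(M, x))  # Pi^{-1} A^T M^{-1} x
    return s


# Hypothetical test problem: random A (m < n), random x, random nonzero start.
rng = np.random.default_rng(0)
m, n = 5, 12
A = rng.standard_normal((m, n))
x = rng.standard_normal(m)
s0 = rng.standard_normal(n)

s_hat = focuss(x, A, s0, num_iter=30, p=0.8)
residual = np.linalg.norm(A @ s_hat - x)       # every iterate should satisfy As = x
```

Scaling the columns of `A` by `w` avoids building the diagonal matrix explicitly, which keeps each iteration at the cost of one m-by-m solve.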

Rao et al. [7] proved by the generalized Hölder inequality that, given $s^{(0)} \neq 0$, the cost function $F(s)$ is monotonically nonincreasing on the sequence $\{s^{(t)}\}$ obtained by (3). Furthermore, based on the global convergence theorem (GCT), they proved that the limit of any convergent subsequence of $\{s^{(t)}\}$ is a stationary point of (3).

Following Gorodnitsky, Rao et al's pioneering works [5]- 
[8], [14], [17], [21], in this paper, we further strengthen the 
FOCUSS algorithm theoretically and develop much stronger 
convergence results towards establishing a systematic conver- 
gence theory for it, in which the auxiliary function plays an 
essential role [38]. 

The rest of this paper is organized as follows. Section II states some mild assumptions. A rigorous derivation of the FOCUSS algorithm is given in Section III. The convergence of the FOCUSS algorithm is proved in Section IV. Section V discusses how the FOCUSS algorithm is related to the Newton method for the $\ell_p$ optimization problem (2). The convergence rate of FOCUSS is investigated in Section VI. The discussions and conclusions are given in Sections VII and VIII, respectively.

II. Some assumptions 
The FOCUSS iterative formula (3) can be rewritten as

$$s^{(t+1)} = \Pi^{-1}(s^{(t)}) \cdot c^{(t)} = \mathrm{diag}(c^{(t)}) \cdot \big[|s_1^{(t)}|^{2-p}, \ldots, |s_n^{(t)}|^{2-p}\big]^T,$$

where $c^{(t)} = A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x$. Noting that $\Pi^{-1}(s^{(t)}) = \mathrm{diag}(|s_1^{(t)}|^{2-p}, \ldots, |s_n^{(t)}|^{2-p})$ is a diagonal matrix, $\forall i$, we have $s_i^{(t+1)} = 0$ if $s_i^{(t)} = 0$, i.e., $s^* = 0$ is a stationary point of (3) regardless of whether it is an optimal solution or not. Thus, given $s^{(0)} = 0$, the FOCUSS algorithm is convergent and converges to zero. In addition, if $x = 0$, the FOCUSS algorithm (3) will directly find the sparsest solution $s^* = 0$. Without loss of generality, in this paper we investigate the convergence issues of the FOCUSS algorithm under the following assumptions:

Assumption 1. $x \neq 0$;

Assumption 2. For $A = [a_1, \ldots, a_n] \in \mathbb{R}^{m \times n}$ ($m < n$), any $m$ columns of $A$ are linearly independent;

Assumption 3. The initializations $s^{(0)} = [s_1^{(0)}, \ldots, s_n^{(0)}]^T$ used for the FOCUSS algorithm are entrywise (strictly) nonzero, i.e., $s_i^{(0)} \neq 0$ for $i = 1, \ldots, n$.
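
For small problems, Assumption 2 can be checked by brute force. The helper below is a hypothetical illustration (the name `any_m_columns_independent` and the tolerance are our own choices, not part of the original algorithm); it enumerates all choices of m columns, so it is only practical when n is small.

```python
import itertools

import numpy as np


def any_m_columns_independent(A, tol=1e-10):
    """Return True if every set of m columns of the m x n matrix A has rank m."""
    m, n = A.shape
    for cols in itertools.combinations(range(n), m):
        sub = A[:, list(cols)]                       # m x m submatrix
        if np.linalg.matrix_rank(sub, tol=tol) < m:  # rank-deficient column set found
            return False
    return True


A_good = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])   # every pair of columns is independent
A_bad = np.array([[1.0, 2.0, 0.0],
                  [0.0, 0.0, 1.0]])    # columns 0 and 1 are parallel

good_ok = any_m_columns_independent(A_good)
bad_ok = any_m_columns_independent(A_bad)
```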



III. A RIGOROUS DERIVATION FOR FOCUSS ALGORITHM 

The FOCUSS algorithm was derived and justified in [5]- 
[8], [14], [17], [21]. However, as will be explained in detail, 
the proof is not rigorous [37]. In this section, we propose a 
rigorous derivation for FOCUSS by constructing an auxiliary 
function, which sheds light on how the FOCUSS algorithm 
decreases the cost function during iterations. 

A. The existing derivation for FOCUSS algorithm 

The Lagrange multiplier method was employed to solve problem (2) in [7], in which the Lagrange function is

$$L(s, \alpha) = F(s) + \alpha^T (As - x), \tag{4}$$

where $\alpha$ is an $m \times 1$ vector of the Lagrange multipliers. A necessary condition for the solution $s^*$ to exist is that $(s^*, \alpha^*)$ is a stationary point of the Lagrange function, i.e.:

$$\frac{\partial L(s, \alpha)}{\partial s} = \frac{\partial F(s)}{\partial s} + A^T \alpha = 0, \qquad \frac{\partial L(s, \alpha)}{\partial \alpha} = As - x = 0, \tag{5}$$

where $\frac{\partial F(s)}{\partial s} = p \cdot \Pi(s) \cdot s$ [7], [21]. Solving the equation set (5), we can derive the FOCUSS equations as follows (see [7] and [37]):

$$s = \Pi^{-1}(s) \cdot A^T \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} \cdot x. \tag{6}$$

Replacing $s$ with $s^{(t)}$ on the right side of (6), we have the FOCUSS iterate (3).

However, as mentioned previously, the derivation above for (3) is theoretically not rigorous [37]. Note that the equation set (5) does not hold when $0 < p \le 1$ because some components of $s$ can be zeros. To be precise, the matrix $\Pi(s)$ does not exist in this case although the matrix $\Pi^{-1}(s)$ does, because $|s_i|^{p-2} \to \infty$ as $s_i \to 0$. In order to solve this problem, next we propose a new derivation for (3) that is valid for $0 < p < 2$.

B. A new derivation for FOCUSS algorithm 

Let's start from the concept of an auxiliary function [38].

Definition 1. A function $f(s|s^{(t)})$ with respect to $s$ is said to be an auxiliary function to $F(s)$ if $f(s|s^{(t)}) \ge F(s)$ and $f(s^{(t)}|s^{(t)}) = F(s^{(t)})$.

From Definition 1, carrying out some simple manipulations, we have the following lemma.

Lemma 1. Let

$$F(s_i) = |s_i|^p \tag{7}$$

and

$$f(s_i|s_i^{(t)}) = \frac{p}{2} |s_i^{(t)}|^{p-2} \cdot s_i^2 + \Big(1 - \frac{p}{2}\Big) \cdot |s_i^{(t)}|^p, \tag{8}$$

where $0 < p < 2$. Then $f(s_i|s_i^{(t)})$ is an auxiliary function to $F(s_i)$, i.e., $\forall s_i$, $f(s_i|s_i^{(t)}) \ge F(s_i)$, and $f(s_i^{(t)}|s_i^{(t)}) = F(s_i^{(t)})$.

Proof: As shown in Fig. 1, we can readily prove it by verifying the two conditions of Definition 1. ■
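
The two conditions of Definition 1 for the pair (7), (8) can also be spot-checked numerically. The sketch below is illustrative only: the grid, the sample values of p and the expansion points `s_t` are arbitrary assumptions, and a finite grid check is of course not a proof.

```python
import numpy as np


def F(s, p):
    """Scalar cost (7): |s|^p, applied elementwise."""
    return np.abs(s) ** p


def f_aux(s, s_t, p):
    """Auxiliary function (8) expanded around the nonzero point s_t."""
    return 0.5 * p * np.abs(s_t) ** (p - 2.0) * s ** 2 + (1.0 - 0.5 * p) * np.abs(s_t) ** p


grid = np.linspace(-3.0, 3.0, 601)
min_gap = {}      # smallest value of f - F over the grid (should be >= 0)
touch_gap = {}    # |f - F| at s = s_t (should be 0)
for p in (0.5, 1.0, 1.5):
    for s_t in (-1.2, 0.3, 2.0):
        min_gap[(p, s_t)] = float(np.min(f_aux(grid, s_t, p) - F(grid, p)))
        touch_gap[(p, s_t)] = float(abs(f_aux(s_t, s_t, p) - F(s_t, p)))
```

The majorization follows from the concavity of $u \mapsto u^{p/2}$ in $u = s_i^2$: (8) is the tangent line of that map at $u = (s_i^{(t)})^2$.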
Letting

$$F(s) = \sum_{i=1}^{n} F(s_i)$$

and

$$f(s|s^{(t)}) = \sum_{i=1}^{n} f(s_i|s_i^{(t)}),$$

$f(s|s^{(t)})$ is also an auxiliary function to $F(s)$. Instead of the $\ell_p$ optimization problem (2), now we consider its corresponding auxiliary optimization problem as follows:

$$\min_{s} f(s|s^{(t)}) = \min_{s} \sum_{i=1}^{n} \Big[ \frac{p}{2} |s_i^{(t)}|^{p-2} \cdot s_i^2 + \Big(1 - \frac{p}{2}\Big) \cdot |s_i^{(t)}|^p \Big] \quad \text{subject to: } x = As. \tag{9}$$

Theorem 1. If $s^{(t)}$ is entrywise nonzero, i.e., $|s^{(t)}| \succ 0$, the FOCUSS iterate $s^{(t+1)}$ obtained by (3) is the globally optimal solution of the quadratic optimization problem (9), which satisfies $As^{(t+1)} = x$ and

$$f(s^{(t+1)}|s^{(t)}) \le f(s^{(t)}|s^{(t)}) = F(s^{(t)}). \tag{10}$$

Proof: For problem (9), we can construct the following Lagrange function

$$L(s, \alpha) = f(s|s^{(t)}) + \alpha^T (As - x). \tag{11}$$

The necessary condition for the solution $s^*$ to exist is that $(s^*, \alpha^*)$ is a stationary point of the Lagrange function, i.e.,

$$\frac{\partial L(s, \alpha)}{\partial s} = \frac{\partial f(s|s^{(t)})}{\partial s} + A^T \alpha = 0, \qquad \frac{\partial L(s, \alpha)}{\partial \alpha} = As - x = 0,$$

where $\frac{\partial f(s|s^{(t)})}{\partial s} = p \cdot \Pi(s^{(t)}) \cdot s$. Noting that the vector $s^{(t)}$ is entrywise nonzero, we can compute $\Pi(s^{(t)})$. Thus, we can further obtain

$$s = -\frac{1}{p} \cdot \Pi^{-1}(s^{(t)}) \cdot A^T \alpha. \tag{12}$$

Combining (12) with $x = As$, we can derive

$$\alpha = -p \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} x. \tag{13}$$

Substituting (13) into (12), we can immediately obtain the solution of problem (9) as follows:

$$s = \Pi^{-1}(s^{(t)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x.$$

Thus, letting

$$s^{(t+1)} = \Pi^{-1}(s^{(t)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x,$$

we have $As^{(t+1)} = x$ and

$$f(s^{(t+1)}|s^{(t)}) = \min_{s} f(s|s^{(t)}) \le f(s^{(t)}|s^{(t)}).$$

On the other side, $f(s|s^{(t)})$ is an auxiliary function to $F(s)$. So $f(s^{(t)}|s^{(t)}) = F(s^{(t)})$. Therefore, we have $f(s^{(t+1)}|s^{(t)}) \le F(s^{(t)})$. ■

Theorem 2. Given an entrywise nonzero initialization $s^{(0)}$, the iterative sequence $\{F(s^{(t)})\}_{t=0}^{+\infty}$ obtained by (3) is convergent.

Proof: Since $f(s|s^{(t)})$ is an auxiliary function to $F(s)$, $\forall t$, we have

$$f(s|s^{(t)}) \ge F(s) \;\Rightarrow\; F(s^{(t+1)}) \le f(s^{(t+1)}|s^{(t)}). \tag{14}$$

Combining (10) with (14), we have the following iterative inequalities

$$0 \le F(s^{(t+1)}) \le f(s^{(t+1)}|s^{(t)}) \le F(s^{(t)}), \tag{15}$$

where $t = 0, \ldots, +\infty$. By (15), we can recursively derive

$$0 \le F(s^{(t+1)}) \le F(s^{(t)}) \le \cdots \le F(s^{(0)}),$$

i.e., the sequence $\{F(s^{(t)})\}_{t=0}^{+\infty}$ is monotonically nonincreasing and bounded. Accordingly, $\{F(s^{(t)})\}_{t=0}^{+\infty}$ is convergent. ■

C. An extended FOCUSS algorithm for the more general sparsity measure

It is worth noting that the FOCUSS algorithm (3) can be straightforwardly extended for a general sparse representation problem as follows:

$$\min_{s} \sum_{i=1}^{n} F(|s_i|) \quad \text{subject to: } x = As,$$
where $F(s)$ is a more general sparsity measure than the $\ell_p$-norm such that $F''(s) < 2$ for $s > 0$, e.g., $F(|s|) = \ln|s|$ and $F(|s|) = -|s|^p$ ($p < 0$). However, in this case $\Pi(s)$ is replaced with

$$\Pi(s) = \mathrm{diag}\Big( \frac{F'(|s_1|)}{2|s_1|}, \ldots, \frac{F'(|s_n|)}{2|s_n|} \Big)$$

in (3). And we can analogously prove it by constructing the following optimization problem

$$\min_{s} \sum_{i=1}^{n} f(s_i|s_i^{(t)}) \quad \text{subject to: } x = As,$$

where the auxiliary function is

$$f(s_i|s_i^{(t)}) = \frac{F'(|s_i^{(t)}|)}{2|s_i^{(t)}|} \cdot \big[ s_i^2 - (s_i^{(t)})^2 \big] + F(|s_i^{(t)}|).$$



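IV. The convergence of FOCUSS algorithm

Theorem 3. For $0 < p < 2$, supposing $A$, $x$ and $s^{(0)}$ are such that Assumption 1-Assumption 3 hold, the FOCUSS algorithm converges, i.e., the iterative sequence $\{s^{(t)}\}_{t=0}^{+\infty}$ obtained by (3) is convergent.

Proof: From Theorem 1, given an entrywise nonzero $s^{(0)}$, we have $As^{(t)} = x$ for $t = 1, \ldots, +\infty$. Then, we can derive

$$A[s^{(t)} - s^{(t+1)}] = 0 \;\Rightarrow\; [s^{(t)} - s^{(t+1)}]^T A^T = 0^T \tag{16}$$

for $t = 1, \ldots, +\infty$. From (3), we have

$$\Pi(s^{(t)}) \cdot s^{(t+1)} = A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x. \tag{17}$$

Combining (16) and (17), we can obtain

$$[s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) \cdot s^{(t+1)} = [s^{(t)} - s^{(t+1)}]^T \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x = 0^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x = 0. \tag{18}$$

Since $\{F(s^{(t)})\}_{t=0}^{+\infty} = \big\{ \sum_{i=1}^{n} |s_i^{(t)}|^p \big\}_{t=0}^{+\infty}$ is convergent, the sequence $\{|s_i^{(t)}|^{p-2}\}_{t=0}^{+\infty}$ is lower bounded for $0 < p < 2$ and $i = 1, \ldots, n$. Let

$$C = \min\Big\{ \frac{p}{2} |s_i^{(t)}|^{p-2} \;\Big|\; i = 1, \ldots, n;\; t = 0, \ldots, +\infty \Big\},$$

i.e., $\frac{p}{2} |s_i^{(t)}|^{p-2} \ge C > 0$ for $i = 1, \ldots, n$ and $t = 0, \ldots, +\infty$. Then from (18), we have

$$\begin{aligned}
F(s^{(t)}) - f(s^{(t+1)}|s^{(t)})
&= f(s^{(t)}|s^{(t)}) - f(s^{(t+1)}|s^{(t)}) - p\,[s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) \cdot s^{(t+1)} \\
&= \sum_{i=1}^{n} \big[ f(s_i^{(t)}|s_i^{(t)}) - f(s_i^{(t+1)}|s_i^{(t)}) \big] - p\,[s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) \cdot s^{(t+1)} \\
&= \sum_{i=1}^{n} \frac{p}{2} |s_i^{(t)}|^{p-2} \cdot [s_i^{(t)} - s_i^{(t+1)}] \cdot [s_i^{(t)} + s_i^{(t+1)}] - p\,[s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) \cdot s^{(t+1)} \\
&= \frac{p}{2} [s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) [s^{(t)} + s^{(t+1)}] - p\,[s^{(t)} - s^{(t+1)}]^T \Pi(s^{(t)}) \cdot s^{(t+1)} \\
&= \frac{p}{2} \cdot [s^{(t)} - s^{(t+1)}]^T \cdot \Pi(s^{(t)}) \cdot [s^{(t)} - s^{(t+1)}] \\
&\ge C \cdot \|s^{(t)} - s^{(t+1)}\|^2 \ge 0,
\end{aligned}$$

i.e.,

$$F(s^{(t)}) - f(s^{(t+1)}|s^{(t)}) \ge C \cdot \|s^{(t)} - s^{(t+1)}\|^2 \ge 0. \tag{19}$$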

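From Theorem 2, $\{F(s^{(t)})\}_{t=0}^{+\infty}$ is convergent, i.e., $0 \le \lim_{t \to +\infty} F(s^{(t)}) = F^{*} < +\infty$. Thus, from (19) and (15), we have

$$\begin{aligned}
0 \le C \sum_{t=0}^{+\infty} \|s^{(t)} - s^{(t+1)}\|^2
&\le \sum_{t=0}^{+\infty} \big[ F(s^{(t)}) - f(s^{(t+1)}|s^{(t)}) \big] \\
&\le \sum_{t=0}^{+\infty} \big[ F(s^{(t)}) - F(s^{(t+1)}) \big] = F(s^{(0)}) - F^{*} < +\infty \\
\Rightarrow \; \sum_{t=0}^{+\infty} \|s^{(t)} - s^{(t+1)}\|^2 &< +\infty.
\end{aligned} \tag{20}$$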



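Next, we will show that $\{s^{(t)}\}_{t=0}^{+\infty}$ is a Cauchy sequence. Given two integers $t_1$ and $t_2$ such that $0 < t_1 < t_2$, we have

$$\|s^{(t_1)} - s^{(t_2)}\|^2 = \Big\| \sum_{t=t_1}^{t_2 - 1} \big[ s^{(t)} - s^{(t+1)} \big] \Big\|^2 \le \Big( \sum_{t=t_1}^{t_2 - 1} \|s^{(t)} - s^{(t+1)}\| \Big)^2 \le 2 \cdot \sum_{t=t_1}^{t_2 - 1} \|s^{(t)} - s^{(t+1)}\|^2,$$

because we have $a_0^2 \le (a_1 + \cdots + a_n)^2 \le 2(a_1^2 + \cdots + a_n^2)$ if $a_0 \le a_1 + \cdots + a_n$ for nonnegative numbers $a_0, \ldots, a_n$. Then, from (20), we can derive

$$\|s^{(t_1)} - s^{(t_2)}\| \le \sqrt{2} \cdot \sqrt{ \sum_{t=t_1}^{+\infty} \|s^{(t)} - s^{(t+1)}\|^2 } \to 0$$

as $t_1 \to +\infty$, i.e., $\{s^{(t)}\}_{t=0}^{+\infty}$ is a Cauchy sequence so that it is convergent. ■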
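
The chain of inequalities (15) and the accumulation of squared step lengths in (20) can be observed on a small random instance. The following is only an illustrative sketch: the helper names, the random data, the choice p = 1 and the tolerance are assumptions, and the loop starts from $s^{(1)}$ because only iterates with $t \ge 1$ are guaranteed to satisfy $As^{(t)} = x$.

```python
import numpy as np


def focuss_step(s, A, x, p):
    """One FOCUSS update (3)."""
    w = np.abs(s) ** (2.0 - p)
    M = (A * w) @ A.T
    return w * (A.T @ np.linalg.solve(M, x))


def cost(s, p):
    """F(s) = sum_i |s_i|^p."""
    return np.sum(np.abs(s) ** p)


def aux(s, s_t, p):
    """Auxiliary function f(s | s_t) summed over components, as in (9)."""
    return np.sum(0.5 * p * np.abs(s_t) ** (p - 2.0) * s ** 2
                  + (1.0 - 0.5 * p) * np.abs(s_t) ** p)


rng = np.random.default_rng(1)
m, n, p = 4, 9, 1.0
A = rng.standard_normal((m, n))
x = rng.standard_normal(m)
s = focuss_step(rng.standard_normal(n), A, x, p)   # s^(1), which satisfies As = x

tol = 1e-9
costs = [cost(s, p)]
sum_sq_steps = 0.0
chain_holds = True
for _ in range(25):
    s_next = focuss_step(s, A, x, p)
    mid = aux(s_next, s, p)
    # Check F(s^(t+1)) <= f(s^(t+1)|s^(t)) <= F(s^(t)) up to rounding.
    chain_holds = bool(chain_holds
                       and cost(s_next, p) <= mid + tol
                       and mid <= cost(s, p) + tol)
    sum_sq_steps += float(np.sum((s - s_next) ** 2))
    costs.append(cost(s_next, p))
    s = s_next
```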
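V. The relation between the FOCUSS method and the Newton method for $\ell_p$ optimization problem

Theorem 4. The FOCUSS formula (3) is a quasi-Newton (but not exact Newton) algorithm for minimizing the Lagrange function $L(s, \alpha)$ in (4) with the following quasi-Hessian matrix

$$\tilde{H} = \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} \\ \tilde{H}_{21} & \tilde{H}_{22} \end{bmatrix} = \begin{bmatrix} 0 & A \\ A^T & p \cdot \Pi \end{bmatrix}.$$

Proof: From (4), we have

$$\begin{bmatrix} \dfrac{\partial L}{\partial \alpha} \\[2mm] \dfrac{\partial L}{\partial s} \end{bmatrix} = \begin{bmatrix} As - x \\ p \cdot \Pi(s) \cdot s + A^T \alpha \end{bmatrix}. \tag{21}$$

From [40], we can derive (22). By the quasi-Newton iterative formula, we have

$$\begin{bmatrix} \alpha^{(t+1)} \\ s^{(t+1)} \end{bmatrix} = \begin{bmatrix} \alpha^{(t)} \\ s^{(t)} \end{bmatrix} - \tilde{H}^{-1}(\alpha^{(t)}, s^{(t)}) \cdot \begin{bmatrix} \dfrac{\partial L(\alpha^{(t)}, s^{(t)})}{\partial \alpha^{(t)}} \\[2mm] \dfrac{\partial L(\alpha^{(t)}, s^{(t)})}{\partial s^{(t)}} \end{bmatrix} = \begin{bmatrix} -p \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} x \\ \Pi^{-1}(s^{(t)}) A^T [A \Pi^{-1}(s^{(t)}) A^T]^{-1} x \end{bmatrix},$$

thus, we have

$$s^{(t+1)} = \Pi^{-1}(s^{(t)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x.$$

Hence, the FOCUSS algorithm (3) is a quasi-Newton method. The proof is completed. ■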
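
The claim of Theorem 4 can be spot-checked numerically: one step with the quasi-Hessian $\tilde{H}$ on the gradient (21) should reproduce the FOCUSS update (3), whatever the current multiplier is. The sketch below uses hypothetical random data; the variable names and the sampling of `s` away from zero are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 3, 7, 0.7
A = rng.standard_normal((m, n))
x = rng.standard_normal(m)
# Entrywise nonzero current point (magnitudes kept away from zero) and arbitrary multiplier.
s = rng.uniform(0.5, 2.0, size=n) * rng.choice([-1.0, 1.0], size=n)
alpha = rng.standard_normal(m)

Pi = np.diag(np.abs(s) ** (p - 2.0))
Pi_inv = np.diag(np.abs(s) ** (2.0 - p))

# Quasi-Hessian of Theorem 4 and gradient (21), ordered as (alpha, s).
H_tilde = np.block([[np.zeros((m, m)), A],
                    [A.T, p * Pi]])
grad = np.concatenate([A @ s - x, p * Pi @ s + A.T @ alpha])

z_next = np.concatenate([alpha, s]) - np.linalg.solve(H_tilde, grad)
alpha_qn, s_qn = z_next[:m], z_next[m:]

# FOCUSS update (3) and the multiplier (13) for comparison.
M = A @ Pi_inv @ A.T
s_focuss = Pi_inv @ A.T @ np.linalg.solve(M, x)
alpha_focuss = -p * np.linalg.solve(M, x)
```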
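However, the FOCUSS algorithm (3) is NOT an exact Newton method because $\tilde{H}$ is just a quasi-Hessian rather than the exact Hessian matrix except for $p = 2$, noting that $\tilde{H} = H$ only at $p = 2$, where $H$ is the exact Hessian matrix given by

$$H = \begin{bmatrix} \dfrac{\partial^2 L}{\partial \alpha^2} & \dfrac{\partial^2 L}{\partial \alpha \partial s} \\[2mm] \dfrac{\partial^2 L}{\partial s \partial \alpha} & \dfrac{\partial^2 L}{\partial s^2} \end{bmatrix} = \begin{bmatrix} 0 & A \\ A^T & p \cdot (p-1) \cdot \Pi \end{bmatrix}, \tag{24}$$

which is different from $\tilde{H}$. In the same manner as in (22), we obtain $H^{-1}$ in (23). Then, we have the Newton method for the Lagrange function $L(s, \alpha)$ as follows:

$$\begin{bmatrix} \alpha^{(t+1)} \\ s^{(t+1)} \end{bmatrix} = \begin{bmatrix} \alpha^{(t)} \\ s^{(t)} \end{bmatrix} - H^{-1}(\alpha^{(t)}, s^{(t)}) \cdot \begin{bmatrix} \dfrac{\partial L(\alpha^{(t)}, s^{(t)})}{\partial \alpha^{(t)}} \\[2mm] \dfrac{\partial L(\alpha^{(t)}, s^{(t)})}{\partial s^{(t)}} \end{bmatrix} = \begin{bmatrix} -p \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} x \\ \dfrac{1}{p-1} \Pi^{-1}(s^{(t)}) A^T [A \Pi^{-1}(s^{(t)}) A^T]^{-1} x + \Big(1 - \dfrac{1}{p-1}\Big) s^{(t)} \end{bmatrix},$$

i.e., the Newton iteration for minimizing the Lagrange function $L(s, \alpha)$ in (4) is

$$s^{(t+1)} = \frac{1}{p-1} \Pi^{-1}(s^{(t)}) A^T [A \Pi^{-1}(s^{(t)}) A^T]^{-1} x + \Big(1 - \frac{1}{p-1}\Big) s^{(t)},$$

which differs from FOCUSS (3). Unfortunately, the numerical experiments show that the Newton method does not work well. This is probably due to the non-positive definiteness of the Hessian matrix $H$.

VI. Convergence rate of FOCUSS algorithm

One of the key measures of the performance of an iterative algorithm is its rate of convergence [41], [42]. We discuss the convergence rate of the FOCUSS algorithm in this section; the results are summarized in Table I.

Suppose that the sequence $\{s^{(t)}\}_{t=0}^{+\infty}$ converges to $s^{(*)}$. We say that the convergence is linear if there exists a constant $\mu \in (0, 1)$ such that

$$\frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} \le \mu \quad \text{as } t \to +\infty. \tag{25}$$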



The sequence $\{s^{(t)}\}_{t=0}^{+\infty}$ is said to converge superlinearly if

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0.$$

One says that it converges sublinearly if it converges, but

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 1.$$

More generally, we say that its order of convergence is $r$ ($r > 1$) if

$$\frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|^{r}} \le \mu \quad \text{as } t \to +\infty, \tag{26}$$

where $\mu > 0$ is not necessarily less than 1.

It is well known that many quasi-Newton methods converge superlinearly, whereas a Newton method converges quadratically. In contrast, the steepest descent algorithms converge only at a linear rate [42]. In general, the speed of convergence depends on $r$ and (more weakly) on $\mu$, e.g., a quadratically convergent sequence will always eventually converge faster than a linearly convergent one [42].
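
To make the definitions (25) and (26) concrete, the ratios can be computed directly from a stored sequence of iterates. The helper `convergence_ratios` and the two synthetic sequences below are illustrative assumptions, not data from the experiments in this paper.

```python
import numpy as np


def convergence_ratios(iterates, limit, r=1.0):
    """Return ||s^(t+1) - s*|| / ||s^(t) - s*||^r for consecutive iterates."""
    errs = np.array([np.linalg.norm(np.asarray(s) - limit) for s in iterates])
    return errs[1:] / errs[:-1] ** r


limit = np.array([1.0, -2.0])

# Linearly convergent sequence: the error halves at every step (mu = 0.5).
linear_seq = [limit + 0.5 ** t * np.array([1.0, 1.0]) for t in range(10)]
linear_ratios = convergence_ratios(linear_seq, limit)

# Quadratically convergent sequence: the error is squared at every step.
quad_seq = [limit + 0.5 ** (2 ** t) * np.array([1.0, 0.0]) for t in range(5)]
quad_ratios_r1 = convergence_ratios(quad_seq, limit, r=1.0)  # tends to 0 (superlinear)
quad_ratios_r2 = convergence_ratios(quad_seq, limit, r=2.0)  # stays bounded (order 2)
```

In practice the limit $s^{(*)}$ is unknown and has to be replaced by a late iterate, so the last few ratios are dominated by rounding error and should be discarded.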

A. The superlinear convergence of FOCUSS for $0 < p < 1$

Let's begin with a lemma.

Lemma 2. Let $A = [a_1 | \cdots | a_n] \in \mathbb{R}^{m \times n}$ such that $m < n$. If $0 < p \le 1$, the solution $s^* = [s_1^*, \ldots, s_n^*]^T$ of the $\ell_p$ problem (2) has $k \le m$ nonzero components, i.e., $\#\{s^*\} \le m$, where $\#\{\cdot\}$ denotes the number of nonzero components of a set.

Proof: For $p = 1$ and $0 < p < 1$, this lemma has been proven in [26] and [29], respectively. ■

Without loss of generality, suppose the nonzero components of $s$ are $s_{n_1}, \ldots, s_{n_m}$. Denote $\Omega_N = \{n_1, \ldots, n_m\} \subset \{1, \ldots, n\}$ and $\Omega_0 = \{1, \ldots, n\} \setminus \Omega_N$ as the subscript sets of nonzero and zero components, respectively. $A_N = [a_{n_1}, \ldots, a_{n_m}]$, and $A_0$ is the remaining matrix after removing the columns of $A_N$ from $A$. By analogy, $s_N = [s_{n_1}, \ldots, s_{n_m}]^T$ and $s_0$ are the corresponding nonzero and zero parts of $s$, respectively.
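$$\begin{aligned}
\tilde{H}^{-1} &= \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} \\ \tilde{H}_{21} & \tilde{H}_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} I & 0 \\ -\tilde{H}_{22}^{-1} \tilde{H}_{21} & I \end{bmatrix}
\begin{bmatrix} (\tilde{H}_{11} - \tilde{H}_{12} \tilde{H}_{22}^{-1} \tilde{H}_{21})^{-1} & 0 \\ 0 & \tilde{H}_{22}^{-1} \end{bmatrix}
\begin{bmatrix} I & -\tilde{H}_{12} \tilde{H}_{22}^{-1} \\ 0 & I \end{bmatrix} \\
&= \begin{bmatrix} -p \cdot (A \Pi^{-1} A^T)^{-1} & (A \Pi^{-1} A^T)^{-1} A \Pi^{-1} \\ \Pi^{-1} A^T (A \Pi^{-1} A^T)^{-1} & -\dfrac{1}{p} \Pi^{-1} A^T (A \Pi^{-1} A^T)^{-1} A \Pi^{-1} + \dfrac{1}{p} \Pi^{-1} \end{bmatrix}
\end{aligned} \tag{22}$$

$$H^{-1} = \begin{bmatrix} -p(p-1) \cdot (A \Pi^{-1} A^T)^{-1} & (A \Pi^{-1} A^T)^{-1} A \Pi^{-1} \\ \Pi^{-1} A^T (A \Pi^{-1} A^T)^{-1} & -\dfrac{1}{p(p-1)} \Pi^{-1} A^T (A \Pi^{-1} A^T)^{-1} A \Pi^{-1} + \dfrac{1}{p(p-1)} \Pi^{-1} \end{bmatrix} \tag{23}$$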
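
The closed forms (22) and (23) share the same block structure and differ only in the scalar multiplying $\Pi$ ($p$ for the quasi-Hessian, $p(p-1)$ for the exact Hessian). A quick numerical check, sketched below with hypothetical random data and a hypothetical helper `block_inverse`, multiplies each closed form by the corresponding block matrix and compares the product with the identity.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 3, 6, 0.8
A = rng.standard_normal((m, n))
d = rng.uniform(0.5, 2.0, size=n)          # diagonal of Pi (any positive values)
Pi, Pi_inv = np.diag(d), np.diag(1.0 / d)
M_inv = np.linalg.inv(A @ Pi_inv @ A.T)    # (A Pi^{-1} A^T)^{-1}


def block_matrix(c):
    """[[0, A], [A^T, c * Pi]] with c = p for H-tilde and c = p(p-1) for H."""
    return np.block([[np.zeros((m, m)), A], [A.T, c * Pi]])


def block_inverse(c):
    """Closed-form inverse following the pattern of (22) and (23)."""
    B12 = M_inv @ A @ Pi_inv
    B22 = (Pi_inv - Pi_inv @ A.T @ M_inv @ A @ Pi_inv) / c
    return np.block([[-c * M_inv, B12], [B12.T, B22]])


err_quasi = np.max(np.abs(block_matrix(p) @ block_inverse(p) - np.eye(m + n)))
c_exact = p * (p - 1.0)
err_exact = np.max(np.abs(block_matrix(c_exact) @ block_inverse(c_exact) - np.eye(m + n)))
```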



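Lemma 3. For $0 < p < 1$, denote

$$h_i(s^{(t)}) \triangleq |s_i^{(t)}|^{1-p} \cdot \mathrm{sign}(s_i^{(t)}) \cdot a_i^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x, \tag{27}$$

where $\mathrm{sign}(\cdot)$ is the sign function. Suppose the FOCUSS algorithm (3) converges to $s^{(*)}$, then $h(s^{(*)}) = [h_1(s^{(*)}), \ldots, h_n(s^{(*)})]^T$ is a (0,1)-vector given by

$$h_j(s^{(*)}) = \begin{cases} 1, & s_j^{(*)} \neq 0 \\ 0, & s_j^{(*)} = 0 \end{cases}$$

where $j = 1, \ldots, n$.

Proof: First of all, for $0 < p < 1$, from (27), if $s_j^{(*)} = 0$ we can immediately derive $h_j(s^{(*)}) = 0$.

In addition, $s^{(*)}$ is a stationary point of the FOCUSS algorithm, i.e.,

$$s^{(*)} = \Pi^{-1}(s^{(*)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x,$$

which can be equivalently rewritten as follows:

$$\mathrm{diag}(s^{(*)}) \cdot \mathbf{1} = \mathrm{diag}(s^{(*)}) \times \mathrm{diag}\big[ |s_1^{(*)}|^{(1-p)} \cdot \mathrm{sign}(s_1^{(*)}), \ldots, |s_n^{(*)}|^{(1-p)} \cdot \mathrm{sign}(s_n^{(*)}) \big] \times A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x$$

$$\Rightarrow \; h_j(s^{(*)}) = 1 \text{ if } s_j^{(*)} \neq 0.$$

In summary, we have

$$h_j(s^{(*)}) = \begin{cases} 1, & s_j^{(*)} \neq 0 \\ 0, & s_j^{(*)} = 0 \end{cases}$$

where $j = 1, \ldots, n$. ■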

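Lemma 4. Denote

$$G(s^{(*)}) = \Pi^{-1}(s^{(*)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot A. \tag{28}$$

Suppose the FOCUSS algorithm (3) converges to $s^{(*)}$. If $0 < p < 1$, then

$$Q(s^{(*)}) = G(s^{(*)}) \cdot \mathrm{diag}[h(s^{(*)})]$$

is a diagonal (0,1)-matrix such that

$$Q_{ij}(s^{(*)}) = \begin{cases} 1, & i = j \text{ and } s_i^{(*)} \neq 0 \\ 0, & \text{otherwise.} \end{cases}$$

Proof: For $0 < p < 1$, from Lemma 2 we have $\#\{s^{(*)}\} \le m$. Without loss of generality, suppose the nonzero components of $s^{(*)}$ are $s_{n_1}, \ldots, s_{n_m}$, where $\Omega_N = \{n_1, \ldots, n_m\} \subset \{1, \ldots, n\}$. Then, we can get

$$\big[ \Pi^{-1}(s^{(*)}) \cdot A^T \big]_{i:} = \begin{cases} |s_i^{(*)}|^{2-p} \cdot a_i^T, & s_i^{(*)} \neq 0 \text{ or } i \in \Omega_N \\ 0, & \text{otherwise} \end{cases}$$

where $[\cdot]_{i:}$ stands for the $i$th row of a matrix. From Lemma 3 we have

$$\big[ A \cdot \mathrm{diag}[h(s^{(*)})] \big]_{:j} = \begin{cases} a_j, & s_j^{(*)} \neq 0 \text{ or } j \in \Omega_N \\ 0, & \text{otherwise} \end{cases}$$

where $[\cdot]_{:j}$ denotes the $j$th column of a matrix. In addition,

$$[A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} = [A_N \cdot \Pi^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1},$$

where $\Pi(s_N^{(*)})$ is a positive-definite diagonal matrix given by

$$\Pi(s_N^{(*)}) = \mathrm{diag}\big( |s_{n_1}^{(*)}|^{p-2}, \ldots, |s_{n_m}^{(*)}|^{p-2} \big).$$

Consequently, from the expressions above we can derive

$$Q_{ij}(s^{(*)}) = \begin{cases} 1, & i = j \text{ and } s_i^{(*)} \neq 0 \\ 0, & \text{otherwise.} \end{cases}$$

For simplicity, let us consider the simplest example for Lemma 4, where $\Omega_N = \{n_1, \ldots, n_m\} = \{1, \ldots, m\}$, for which we can readily verify Lemma 4. ■

Denote

$$g_i(s) = \Pi_{ii}^{-1}(s) \cdot a_i^T \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} \cdot x,$$

where $i = 1, \ldots, n$. Then

$$g(s) = [g_1(s), \ldots, g_n(s)]^T = G(s) \cdot s$$

and the FOCUSS iterate (3) can be simply written as

$$s^{(t+1)} = g(s^{(t)}).$$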
Fig. 2. A numerical example demonstrating Theorem 5, in which the convergence rate $R^{(t)}$ against iteration $t$ of the FOCUSS algorithm for $0 < p < 1$ is plotted. Both $A \in \mathbb{R}^{125 \times 200}$ and $x \in \mathbb{R}^{125 \times 1}$ are randomly generated in MATLAB 2010b. For $p = 0.6$, $p = 0.7$, $p = 0.8$ and $p = 0.95$, given randomly generated entrywise-nonzero initializations $s^{(0)}$ (i.e., $|s^{(0)}| \succ 0$), the FOCUSS algorithm consistently converges superlinearly to sparse solutions, which exactly satisfy $\#\{s^{(*)}\} = m = 125$ and $\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0$.



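Theorem 5. Suppose $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (3) converges to $s^{(*)}$. If $0 < p < 1$, its convergence rate is superlinear, i.e.,

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0.$$

Proof: From [40], we have

$$\frac{\partial M^{-1}(s)}{\partial s_j} = -M^{-1}(s) \cdot \frac{\partial M(s)}{\partial s_j} \cdot M^{-1}(s).$$

Letting $M(s) = A \cdot \Pi^{-1}(s) \cdot A^T$, we have

$$\begin{aligned}
\frac{\partial g_i(s)}{\partial s_j}
&= \frac{\partial \Pi_{ii}^{-1}(s)}{\partial s_j} \cdot a_i^T [A \Pi^{-1}(s) A^T]^{-1} x + \Pi_{ii}^{-1}(s) \cdot \frac{\partial \big( a_i^T \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} \cdot x \big)}{\partial s_j} \\
&= (2-p)\, \delta_{ij} \cdot h_j(s) - \Pi_{ii}^{-1}(s) \cdot a_i^T \cdot [A \Pi^{-1}(s) A^T]^{-1} \cdot \frac{\partial [A \cdot \Pi^{-1}(s) \cdot A^T]}{\partial s_j} \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} x \\
&= (2-p) \big[ \delta_{ij} \cdot h_j(s) - \Pi_{ii}^{-1}(s) \cdot a_i^T \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} \cdot a_j \cdot h_j(s) \big] \\
&= (2-p) \big[ \delta_{ij} \cdot h_j(s) - \Pi_{ii}^{-1}(s) \cdot a_i^T \cdot [A \cdot \Pi^{-1}(s) \cdot A^T]^{-1} \cdot A \cdot (0, \ldots, h_j(s), \ldots, 0)^T \big],
\end{aligned}$$

where the indicator function is given by

$$\delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j. \end{cases}$$

Then we can derive

$$\begin{aligned}
\frac{\partial g(s)}{\partial s} = \Big[ \frac{\partial g_i(s)}{\partial s_j} \Big]
&= (2-p) \times \big[ \mathrm{diag}(h_1(s), \ldots, h_n(s)) - G(s) \cdot \mathrm{diag}(h_1(s), \ldots, h_n(s)) \big] \\
&= (2-p) \big( \mathrm{diag}[h(s)] - G(s) \cdot \mathrm{diag}[h(s)] \big) \\
&= (2-p) \big( \mathrm{diag}[h(s)] - Q(s) \big).
\end{aligned} \tag{29}$$

From Lemma 3 and Lemma 4 we can obtain

$$\frac{\partial g(s^{(*)})}{\partial s} = 0_{n \times n}.$$

By the mean-value theorem, we have

$$s^{(t+1)} - s^{(*)} = g(s^{(t)}) - g(s^{(*)}) = \frac{\partial g(s^{(*)})}{\partial s} \cdot [s^{(t)} - s^{(*)}] = 0 \cdot [s^{(t)} - s^{(*)}] \tag{30}$$

as $t \to +\infty$. Hence

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0. \; ■$$

To illustrate Theorem 5, we give a numerical example in Fig. 2, in which the convergence rate is illustrated by $R^{(t)}$ defined as follows:

$$R^{(t)} = \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|}.$$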
Note that $\|s^{(t)} - s^{(*)}\| \to 0$ as $t \to +\infty$. Due to the limited machine accuracy of a computer, it is difficult to compute $R^{(t)}$ numerically when $\|s^{(t)} - s^{(*)}\| \to 0$. For this reason, we can see in Fig. 2 that $R^{(t)} \to 0$ at first, as expected from Theorem 5, but then oscillates when the denominator $\|s^{(t)} - s^{(*)}\|$ is so close to zero that it is beyond the rounding precision of the hardware floating point arithmetic. Incidentally, in Fig. 2 we only intend to roughly demonstrate the curve of the convergence rate by experiments, in the hope that the numerical experiments help the reader understand Theorem 5 intuitively.



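B. Convergence rate analysis on FOCUSS for $p = 1$

Theorem 6. Suppose $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (3) converges to $s^{(*)}$. For $p = 1$, its convergence rate is first-order at most, i.e.,

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = \lim_{t \to +\infty} \frac{\big\| [I - G(s^{(*)})] \cdot \mathrm{diag}[h(s^{(*)})] \cdot [s^{(t)} - s^{(*)}] \big\|}{\|s^{(t)} - s^{(*)}\|} \le 1,$$

where

$$h(s^{(*)}) = \mathrm{diag}[\mathrm{sign}(s^{(*)})] \cdot A^T [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x \tag{31}$$

and

$$\mathrm{diag}[\mathrm{sign}(s^{(*)})] = \mathrm{diag}\big( \mathrm{sign}(s_1^{(*)}), \ldots, \mathrm{sign}(s_n^{(*)}) \big).$$

Proof: In Theorem 3, we have proved that for $p = 1$ the FOCUSS algorithm is convergent, given a strictly nonzero initialization $s^{(0)}$, i.e., the sequence $\{s^{(t)}\}_{t=0}^{+\infty}$ obtained by FOCUSS (3) with $p = 1$ is convergent. Hence, we have

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} \le 1,$$

or else $\{s^{(t)}\}_{t=0}^{+\infty}$ would be divergent. Accordingly, from (29) and (30), we have

$$\begin{aligned}
\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|}
&= \lim_{t \to +\infty} \frac{\Big\| \dfrac{\partial g(s^{(*)})}{\partial s} [s^{(t)} - s^{(*)}] \Big\|}{\|s^{(t)} - s^{(*)}\|} \\
&= \lim_{t \to +\infty} \frac{\big\| \big( \mathrm{diag}[h(s^{(*)})] - G(s^{(*)}) \cdot \mathrm{diag}[h(s^{(*)})] \big) [s^{(t)} - s^{(*)}] \big\|}{\|s^{(t)} - s^{(*)}\|} \\
&= \lim_{t \to +\infty} \frac{\big\| [I - G(s^{(*)})] \cdot \mathrm{diag}[h(s^{(*)})] \cdot [s^{(t)} - s^{(*)}] \big\|}{\|s^{(t)} - s^{(*)}\|} \le 1,
\end{aligned}$$

where $h(s^{(*)})$ is given by (31). ■

Remark 1. As mentioned in [26], for $p = 1$, the optimal solution of problem (2) satisfies $\#\{s^{*}\} = m$ if $A$ and $x$ satisfy Assumption 1 and Assumption 2. However, in our experience, due to the numerical inaccuracy of a computer, for $p = 1$ the FOCUSS algorithm usually derives an approximately optimal solution $s^{(*)}$ instead, satisfying $\#\{s^{(*)}\} > m$ but $\#\{|s^{(*)}| \succ \epsilon\} = m$, i.e.,

$$\#\{i : |s_i^{(*)}| > \epsilon \text{ and } i = 1, \ldots, n\} = m,$$
$$\#\{i : |s_i^{(*)}| \le \epsilon \text{ and } i = 1, \ldots, n\} = n - m,$$

where $\epsilon > 0$ is a very small positive number (e.g., $\epsilon = 10^{-30}$). For example, suppose the optimal solution is $s^* = [c_1, c_2, 0, 0, 0]^T$; due to the numerical inaccuracy, the FOCUSS algorithm with $p = 1$ can only converge to $s^{(*)} = [c_1, c_2, \epsilon_1, \epsilon_2, \epsilon_3]^T \approx s^*$ (but $s^{(*)} \neq s^*$) so that $\mathrm{sign}(s^*) \neq \mathrm{sign}(s^{(*)})$. Thus, $R^{(t)} \to R^{(*)} \neq R^{*} = 0$ as $t \to +\infty$.

Extensive experiments show that $0 < R^{(t)} < 1$ for the FOCUSS algorithm with $p = 1$ (see Fig. 3).

C. Convergence analysis on FOCUSS for $1 < p < 2$

Theorem 7. Let $A = [a_1 | \cdots | a_n] \in \mathbb{R}^{m \times n}$ satisfy Assumption 2. For $1 < p < 2$ and $x \neq 0$, the $\ell_p$ problem (2) is non-concave and its solution $s^* = (s_1^*, \ldots, s_n^*)^T$ is such that $\#\{i : s_i^* = 0\} \le m - 1$, i.e., $\#\{s^*\} \ge n - m + 1$.

Proof: Note that $A = [a_1 | \cdots | a_n] \in \mathbb{R}^{m \times n}$ satisfies Assumption 2 and $\mathrm{rank}(A) = m$. Hence, there exists a matrix $C = [c_1 | \cdots | c_{n-m}] \in \mathbb{R}^{n \times (n-m)}$ such that $\mathrm{rank}(C) = n - m$ and $AC = 0_{m \times (n-m)}$, i.e., $\mathcal{R}^{\perp}(A^T) = \mathrm{span}(c_1, \ldots, c_{n-m})$. Given

$$s^{0} = A^{+} x = A^T (A A^T)^{-1} x,$$

we have $As^{0} = x$. Then we can derive

$$A(s^{0} + C\lambda) = x,$$

where $\lambda \in \mathbb{R}^{n-m}$. Subsequently, the equality constrained problem (2) is equivalent to the following unconstrained optimization problem

$$\min_{\lambda} F(\lambda) = \min_{\lambda} \sum_{i=1}^{n} |s_i|^p, \tag{32}$$

where $s = (s_1, \ldots, s_n)^T = s^{0} + C\lambda$. Then we can obtain

$$\frac{\partial F}{\partial \lambda} = p \cdot C^T \begin{bmatrix} |s_1|^{p-1} \cdot \mathrm{sign}(s_1) \\ \vdots \\ |s_n|^{p-1} \cdot \mathrm{sign}(s_n) \end{bmatrix}$$

and

$$\frac{\partial^2 F}{\partial \lambda^2} = p(p-1) \cdot C^T \, \mathrm{diag}\big( |s_1|^{p-2}, \ldots, |s_n|^{p-2} \big) \, C. \tag{33}$$

For $1 < p < 2$, the Hessian (33) is nonnegative-definite.

Suppose $s^*$ is the solution of problem (2), then it is a stationary point of $L(s, \alpha)$ with respect to $s$, i.e.,

$$\frac{\partial L(s^*, \alpha^*)}{\partial s} = \frac{\partial F(s^*)}{\partial s} + A^T \alpha^* = p \cdot \begin{bmatrix} |s_1^*|^{p-1} \cdot \mathrm{sign}(s_1^*) \\ |s_2^*|^{p-1} \cdot \mathrm{sign}(s_2^*) \\ \vdots \\ |s_n^*|^{p-1} \cdot \mathrm{sign}(s_n^*) \end{bmatrix} + A^T \alpha^* = 0, \tag{34}$$

where $L(s, \alpha)$ is given in (4) and

$$\alpha^* = -p \cdot [A \cdot \Pi^{-1}(s^*) \cdot A^T]^{-1} \cdot x \neq 0. \tag{35}$$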
(35) 
Fig. 3. A numerical example demonstrating Theorem 6, illustrating the convergence rate of FOCUSS for $p = 1$, where $n = 30$, and $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^{m \times 1}$ were randomly generated in MATLAB 2010b.



Now we prove it by contradiction. Suppose $\#\{i : s_i^* = 0\} = \#\{o_1, \ldots, o_k\} = k > m - 1$, i.e., $s_{o_1}^* = 0, \ldots, s_{o_k}^* = 0$, respectively. Then from (34) we have

$$A_O^T \cdot \alpha^* = \begin{bmatrix} a_{o_1}^T \\ \vdots \\ a_{o_k}^T \end{bmatrix} \cdot \alpha^* = -p \cdot \begin{bmatrix} |s_{o_1}^*|^{p-1} \cdot \mathrm{sign}(s_{o_1}^*) \\ \vdots \\ |s_{o_k}^*|^{p-1} \cdot \mathrm{sign}(s_{o_k}^*) \end{bmatrix} = 0,$$

where $A_O = [a_{o_1}, \ldots, a_{o_k}] \in \mathbb{R}^{m \times k}$ ($k \ge m$) and $\mathrm{rank}(A_O) = m$. From $A_O^T \cdot \alpha^* = 0$, we can derive $\alpha^* = (A_O \cdot A_O^T)^{-1} A_O \cdot 0 = 0$, which contradicts (35). Hence, we have $\#\{i : s_i^* = 0\} \le m - 1$, equivalently, $\#\{s^*\} \ge n - m + 1$. ■

Lemma 5. Suppose the FOCUSS algorithm (3) converges to $s^{(*)} = [s_1^{(*)}, \ldots, s_n^{(*)}]^T$, i.e., $\lim_{t \to +\infty} s^{(t)} = s^{(*)}$. For $1 < p < 2$, if $s_j^{(*)} = 0$, we have

$$\lim_{t \to +\infty} h_j(s^{(t)}) = 0,$$

where $h_j(s^{(t)})$ is defined in (27), $j = 1, \ldots, n$.

Proof: Note that the FOCUSS iterative sequence $\{s^{(t)}\}_{t=0}^{+\infty}$ is convergent, where $s^{(t)} = [s_1^{(t)}, \ldots, s_n^{(t)}]^T$. Thus, $\{s_j^{(t)}\}_{t=0}^{+\infty}$ is also convergent. Suppose that it converges to $s_j^{(*)}$. Then from (34) and (35), we have

$$|s_j^{(*)}|^{p-1} \cdot \mathrm{sign}(s_j^{(*)}) = a_j^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x.$$

So, if $s_j^{(*)} = 0$, we have $a_j^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x = 0$. In addition, for $1 < p < 2$, $\lim_{t \to +\infty} |s_j^{(t)}|^{1-p} = +\infty$. Hence,

$$h_j(s^{(t)}) = \frac{\mathrm{sign}(s_j^{(t)}) \cdot a_j^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x}{|s_j^{(t)}|^{p-1}} \to \frac{0}{0}$$

as $t \to +\infty$, i.e., $h_j(s^{(t)})$ is an indeterminate form with respect to $s_j^{(t)}$ as $t \to +\infty$. Following the proof of Theorem 5, by L'Hôpital's rule, we can derive

$$\begin{aligned}
|h_j(s^{(*)})|
&= \lim_{t \to +\infty} \frac{\big| \mathrm{sign}(s_j^{(t)}) \cdot a_j^T [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} x \big|}{|s_j^{(t)}|^{p-1}} \\
&\le \lim_{t \to +\infty} \frac{\big| a_j^T [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} x \big|}{|s_j^{(t)}|^{p-1}} \\
&= \frac{1}{p-1} \lim_{s_j^{(t)} \to 0} \frac{\Big| \dfrac{\partial \big( a_j^T \cdot [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} \cdot x \big)}{\partial s_j^{(t)}} \Big|}{|s_j^{(t)}|^{p-2}} \\
&= \frac{2-p}{p-1} \lim_{s_j^{(t)} \to 0} |s_j^{(t)}|^{2-p} \cdot \big| a_j^T [A \cdot \Pi^{-1}(s^{(t)}) \cdot A^T]^{-1} a_j \big| \cdot |h_j(s^{(t)})| \\
&= 0 \cdot |h_j(s^{(*)})|,
\end{aligned}$$

i.e., $|h_j(s^{(*)})| \le 0 \cdot |h_j(s^{(*)})|$, which indicates that $h_j(s^{(*)}) = 0$ or $h_j(s^{(*)}) = \infty$. As $h_j(s^{(*)})$ is bounded, we have

$$\lim_{t \to +\infty} h_j(s^{(t)}) = 0. \; ■$$

Fig. 4. A numerical example demonstrating Theorem 8 for the FOCUSS algorithm with $1 < p < 2$, which is such that $\#\{s^{(*)}\} = m$. The dataset is generated by Algorithm 1 given in Appendix A with the parameters $m = 15$, $n = 20$ and $\#\{s^{(*)}\} = m = 15$.



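Theorem 8. Suppose $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (3) converges to $s^{(*)}$. If $1 < p < 2$ and $\#\{s^{(*)}\} = m$, it converges superlinearly, i.e.,

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0.$$

Theorem 8 is demonstrated in Fig. 4.

Proof: First, for $1 < p < 2$ and $\#\{s^{(*)}\} = m$, from Lemma 5 we have $\lim_{t \to +\infty} h_j(s^{(t)}) = 0$ if $s_j^{(*)} = 0$. In addition, in a way parallel to the proof of Lemma 3, we can derive $\lim_{t \to +\infty} h_j(s^{(t)}) = 1$ if $s_j^{(*)} \neq 0$. So we have

$$\lim_{t \to +\infty} h_j(s^{(t)}) = \begin{cases} 1, & s_j^{(*)} \neq 0 \\ 0, & s_j^{(*)} = 0 \end{cases}$$

for $1 < p < 2$ and $j = 1, \ldots, n$. Then, in the same way as the proof of Theorem 5, we can derive

$$\frac{\partial g(s^{(*)})}{\partial s} = 0_{n \times n}$$

and

$$s^{(t+1)} - s^{(*)} = g(s^{(t)}) - g(s^{(*)}) = \frac{\partial g(s^{(*)})}{\partial s} \cdot [s^{(t)} - s^{(*)}] = 0 \cdot [s^{(t)} - s^{(*)}]$$

as $t \to +\infty$. Thus, for $1 < p < 2$, we have

$$\lim_{t \to +\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 0 \quad \text{if } \#\{s^{(*)}\} = m. \; ■$$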

Theorem 9. Suppose A, x and satisfy Assumption Q} 
Assumption ant/ the FOCUSS algorithm (0) converges to 
s^*\ For 1 < p < 2, if #{s(*)} = n, it converges linearly 
and 

'| S (*+D_ S (*)|| 



lim 



= 2-p<l. 



The Theorem [9] is demonstrated by an illustrative example 
in Fig|5] 



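Proof: For $1 < p < 2$ and $\#\{s^{(*)}\} = n$, the stationary point is entrywise nonzero, i.e., $s_i^{(*)} \neq 0$ for $i = 1, \dots, n$. Note that

$$s^{(*)} = \Pi^{-1}(s^{(*)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x$$

can be rewritten as follows:

$$\begin{aligned}
\operatorname{diag}(s^{(*)}) \cdot 1_{n\times 1} &= \operatorname{diag}(s^{(*)}) \times \operatorname{diag}^{-1}\big[\,|s_1^{(*)}|^{p-1}\cdot\operatorname{sign}(s_1^{(*)}), \dots, |s_n^{(*)}|^{p-1}\cdot\operatorname{sign}(s_n^{(*)})\,\big] \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot x \\
&= \operatorname{diag}(s^{(*)}) \cdot [h_1(s^{(*)}), \dots, h_n(s^{(*)})]^T = \operatorname{diag}(s^{(*)}) \cdot h(s^{(*)}) \\
&\Longrightarrow \operatorname{diag}[h(s^{(*)})] = I_{n\times n}. \qquad (36)
\end{aligned}$$

On the other hand, from Theorem 1, given a strictly nonzero initialization $s^{(0)}$, we have $A s^{(t)} = x$ under the FOCUSS algorithm (5) for $t = 1, \dots, +\infty$. Then, from (28) and (36), we have

$$\begin{aligned}
G(s^{(*)}) \cdot \operatorname{diag}[h(s^{(*)})] \cdot [s^{(t)} - s^{(*)}]
&= G(s^{(*)}) \cdot [s^{(t)} - s^{(*)}] \\
&= \Pi^{-1}(s^{(*)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot A \cdot [s^{(t)} - s^{(*)}] \\
&= \Pi^{-1}(s^{(*)}) \cdot A^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} \cdot [x - x] \\
&= 0.
\end{aligned}$$

Then, from (29), (30) and (36), we can obtain

$$\begin{aligned}
e^{(t+1)} = s^{(t+1)} - s^{(*)} &= g(s^{(t)}) - g(s^{(*)}) \\
&\approx \frac{\partial g(s^{(*)})}{\partial s}\,[s^{(t)} - s^{(*)}] \\
&= (2-p)\big(\operatorname{diag}[h(s^{(*)})] - G(s^{(*)}) \cdot \operatorname{diag}[h(s^{(*)})]\big)\,[s^{(t)} - s^{(*)}] \\
&= (2-p) \cdot \operatorname{diag}[h(s^{(*)})] \cdot [s^{(t)} - s^{(*)}] \\
&= (2-p) \cdot e^{(t)}
\end{aligned}$$

$$\Longrightarrow \lim_{t\to+\infty} \frac{\|e^{(t+1)}\|}{\|e^{(t)}\|} = \lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 2 - p < 1. \qquad ■$$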
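The linear rate $2 - p$ can be checked numerically with a few lines of NumPy. The sketch below assumes the FOCUSS update has the fixed-point form used in the proof, $s \leftarrow \Pi^{-1}(s) A^T [A\,\Pi^{-1}(s)\,A^T]^{-1} x$ with $\Pi^{-1}(s) = \operatorname{diag}(|s_i|^{2-p})$. The function names, the problem size, the random seed and the iteration counts are illustrative choices, not taken from the paper, and the limit $s^{(*)}$ is approximated by a late iterate.

```python
import numpy as np

def focuss_step(A, x, s, p):
    """One assumed FOCUSS update: s <- D A^T (A D A^T)^{-1} x, D = diag(|s|^(2-p))."""
    d = np.abs(s) ** (2.0 - p)            # diagonal of Pi^{-1}(s)
    AD = A * d                            # equals A @ diag(d) by broadcasting
    alpha = np.linalg.solve(AD @ A.T, x)  # alpha = (A D A^T)^{-1} x
    return d * (A.T @ alpha)

def focuss_iterates(A, x, s0, p, n_iter):
    """Return the list [s^(0), s^(1), ..., s^(n_iter)]."""
    iterates = [s0]
    for _ in range(n_iter):
        iterates.append(focuss_step(A, x, iterates[-1], p))
    return iterates

rng = np.random.default_rng(0)
m, n, p = 4, 8, 1.5                       # illustrative sizes, 1 < p < 2
A = rng.standard_normal((m, n))
x = rng.standard_normal(m)
s0 = rng.standard_normal(n)               # entrywise nonzero with probability one

iterates = focuss_iterates(A, x, s0, p, 300)
s_star = iterates[-1]                     # late iterate used as a proxy for s^(*)
t = 25
ratio = (np.linalg.norm(iterates[t + 1] - s_star)
         / np.linalg.norm(iterates[t] - s_star))
print(f"error ratio at t={t}: {ratio:.4f}   2 - p = {2 - p:.4f}")
```

For generic random data every entry of the limit is nonzero when $1 < p < 2$, so the printed ratio is expected to be close to $2 - p$.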
Fig. 5. An example demonstrating Theorem 9, in which it is shown that $\lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 2 - p$ for $1 < p < 2$ and $\#\{s^{(*)}\} = n$. In this example, the datasets are the same as in Fig. 2. For $p = 1.1$, $p = 1.3$, $p = 1.5$, $p = 1.7$, $p = 1.9$ and $p = 1.95$, given entrywise-nonzero initializations $s^{(0)}$ generated randomly, the FOCUSS algorithm converges at a linear rate in every case, such that $\#\{s^{(*)}\} = n = 200$ and $\lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 2 - p$.
Fig. 6. An example illustrating Theorem 10, which shows that $\lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 2 - p$ for $1 < p < 2$ and $m < \#\{s^{(*)}\} < n$. The datasets, containing $A \in R^{13\times 20}$ and $x \in R^{13\times 1}$, were randomly generated by Algorithm 2 in Appendix B. Horizontal axis of each panel: Iterations (t).



Theorem 10. Suppose $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (5) converges to $s^{(*)}$. For $1 < p < 2$, if $m < \#\{s^{(*)}\} = k < n$, it also converges at a linear rate and

$$\lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = 2 - p.$$

Theorem 10 is demonstrated by a numerical example in Fig. 6.



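Proof: As described previously, $s_N^{(*)}$ and $s_O^{(*)}$ denote the nonzero and zero components of $s^{(*)}$, respectively. Thus, we have $m < \#\{s^{(*)}\} = \#\{s_N^{(*)}\} = k < n$. Noting that $\lim_{t\to+\infty} s_O^{(t)} = s_O^{(*)} = 0_{(n-k)\times 1}$ and following the proofs of Theorem [ ] and Theorem 8, we can derive the following infinitesimal expressions for $1 < p < 2$, where $e_N^{(t)} = s_N^{(t)} - s_N^{(*)}$ and $e_O^{(t)} = s_O^{(t)} - s_O^{(*)}$:

$$e_O^{(t+1)} = s_O^{(t+1)} - s_O^{(*)} = g_O(s^{(t)}) - g_O(s^{(*)}) = \frac{\partial g_O(s^{(*)})}{\partial s_N}\,[s_N^{(t)} - s_N^{(*)}] + o[e_N^{(t)}] = 0_{(n-k)\times k} \cdot e_N^{(t)} + o[e_N^{(t)}],$$

i.e.,

$$e_O^{(t+1)} = o[e_N^{(t)}]. \qquad (37)$$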



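Moreover, following the proof of Theorem 9, we can analogously obtain

$$e_N^{(t+1)} = (2-p) \cdot e_N^{(t)} + o[e_N^{(t)}], \qquad (38)$$

$$e_N^{(t)} = \frac{1}{2-p} \cdot e_N^{(t+1)} + o[e_N^{(t)}]. \qquad (39)$$

From (37) and (39), we can derive

$$e_O^{(t+1)} = o\Big[\frac{1}{2-p} \cdot e_N^{(t+1)} + o[e_N^{(t)}]\Big] = o[e_N^{(t+1)}] \qquad (40)$$

as $t \to +\infty$, so that $e_O^{(t)} = o[e_N^{(t)}]$ as well.

Combining (37), (38) and (40), we have

$$\lim_{t\to+\infty} \frac{\|e^{(t+1)}\|}{\|e^{(t)}\|}
= \lim_{t\to+\infty} \frac{\left\| \begin{bmatrix} e_N^{(t+1)} \\ e_O^{(t+1)} \end{bmatrix} \right\|}{\left\| \begin{bmatrix} e_N^{(t)} \\ e_O^{(t)} \end{bmatrix} \right\|}
= \lim_{t\to+\infty} \frac{\left\| \begin{bmatrix} (2-p) \cdot e_N^{(t)} + o[e_N^{(t)}] \\ o[e_N^{(t)}] \end{bmatrix} \right\|}{\left\| \begin{bmatrix} e_N^{(t)} \\ o[e_N^{(t)}] \end{bmatrix} \right\|}
= \lim_{t\to+\infty} \frac{\left\| \begin{bmatrix} (2-p) \cdot e_N^{(t)} \\ 0_{(n-k)\times 1} \end{bmatrix} \right\|}{\left\| \begin{bmatrix} e_N^{(t)} \\ 0_{(n-k)\times 1} \end{bmatrix} \right\|}.$$

Therefore, we also have

$$\lim_{t\to+\infty} \frac{\|s^{(t+1)} - s^{(*)}\|}{\|s^{(t)} - s^{(*)}\|} = \lim_{t\to+\infty} \frac{(2-p)\,\|e_N^{(t)}\|}{\|e_N^{(t)}\|} = 2 - p$$

for $1 < p < 2$ if $m < \#\{s^{(*)}\} < n$. ■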



VII. Discussions 

It is worth mentioning that for $p \le 0$, the FOCUSS algorithm (5) is also applicable to sparse representation: its solutions also satisfy $\#\{s^{(*)}\} \le m$ and its convergence rate is also superlinear, where the corresponding sparseness measures are $F(s) = \ln|s|$ and $F(s) = -|s|^{p}$ for $p = 0$ and $p < 0$, respectively. Moreover, these points can be proved analogously, in the same way as for $0 < p < 1$. As a consequence, the convergence results of the FOCUSS algorithm (5) can be summarized in Table I.

In our experience, if $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 for $1 < p < 2$, the FOCUSS algorithm (5) usually obtains an approximately sparse solution $s^{(*)}$ such that $\#\{s^{(*)}\} = n$ but the number of significantly nonzero components of $s^{(*)}$ is less than $n$, i.e., $\#\{s^{(*)}\} = n$ but $\#\{s_i^{(*)} : |s_i^{(*)}| > \epsilon\} < n$, where $\epsilon$ is a small positive number (e.g., $\epsilon = 10^{-2}$). So in practice, for $1 < p < 2$, the case of Theorem 9 occurs much more often than the cases of Theorem 8 and Theorem 10.
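The distinction between exact and significant nonzeros can be made concrete with a short NumPy sketch. The helper name `count_significant` and the example vector are hypothetical and serve only as an illustration. The threshold follows the value $\epsilon = 10^{-2}$ quoted above.

```python
import numpy as np

def count_significant(s, eps=1e-2):
    """Count entries of s whose magnitude exceeds the threshold eps."""
    return int(np.sum(np.abs(s) > eps))

# Made-up vector: five exact nonzeros, of which only three exceed eps.
s = np.array([0.9, -0.4, 3e-3, -7e-4, 0.0, 1.2])
print(np.count_nonzero(s), count_significant(s))
```

Here `np.count_nonzero` plays the role of $\#\{s^{(*)}\}$, while `count_significant` counts the entries with $|s_i^{(*)}| > \epsilon$.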

Comparing Theorem 5 and Theorem 9, we know that the FOCUSS algorithm converges more rapidly for $0 < p < 1$ than for $1 < p < 2$. However, it gets stuck in local minima relatively easily if $p$ is too small (e.g., $p = 0.1$), because the optimization problem (2) is not convex for $0 < p < 1$. Accordingly, it is suggested to select a value for $p$ that is slightly smaller than 1 but not too small. Typically, we can set $p = 0.8$ for larger-scale problems.



VIII. Conclusions 

The FOCUSS method is one of the most efficient algorithms for sparse representation and compressive sensing, and it is easy to implement. In this paper, we provide a thorough convergence analysis of this algorithm towards establishing a systematic convergence theory for it. First, we give a rigorous derivation via an auxiliary function. Then, we prove its convergence. In particular, we rigorously analyze its convergence rate for different values of the sparsity parameter $p$ and demonstrate the convergence rate by numerical experiments.

Appendix A 

An algorithm generating datasets for Theorem 8

In order to demonstrate Theorem 8 and Theorem 10, we specially design two algorithms that generate appropriate datasets satisfying the conditions of the two theorems, respectively.

Suppose the FOCUSS algorithm (5) converges to $s^{(*)}$ such that $\#\{s^{(*)}\} = m$. Then $x$ can be represented as

$$x = A s^{(*)} = [A_N, A_O] \begin{bmatrix} s_N^{(*)} \\ s_O^{(*)} \end{bmatrix},$$

where $s_O^{(*)} = [s_{O_1}^{(*)}, \dots, s_{O_{n-m}}^{(*)}]^T$ and $s_N^{(*)} = [s_{N_1}^{(*)}, \dots, s_{N_m}^{(*)}]^T$ are the zero and nonzero components of $s^{(*)}$, respectively; by analogy, $A = [A_N, A_O]$, where $A_N$ and $A_O$ respectively correspond to $s_N^{(*)}$ and $s_O^{(*)}$ in terms of $x = A_N \cdot s_N^{(*)} + A_O \cdot s_O^{(*)}$.

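Theorem 11. For $1 < p < 2$, supposing $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (5) converges to $s^{(*)}$, the sufficient and necessary condition for $\#\{s^{(*)}\} = m$ is

$$A_O^T \cdot (A_N^{-1})^T \cdot \begin{bmatrix} |s_{N_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_1}^{(*)}) \\ \vdots \\ |s_{N_m}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_m}^{(*)}) \end{bmatrix} = 0_{(n-m)\times 1}. \qquad (41)$$

Proof: Note that $\#\{s^{(*)}\} = \#\{s_N^{(*)}\} = m$ and $s_O^{(*)} = [s_{O_1}^{(*)}, \dots, s_{O_{n-m}}^{(*)}]^T = 0_{(n-m)\times 1}$. For $1 < p < 2$, from (34) and (35), we can derive

$$\begin{bmatrix} |s_{O_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_1}^{(*)}) \\ |s_{O_2}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_2}^{(*)}) \\ \vdots \\ |s_{O_{n-m}}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_{n-m}}^{(*)}) \end{bmatrix} - A_O^T \cdot \alpha^{(*)} = 0_{(n-m)\times 1},$$

where $\alpha^{(*)} = [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} x$. Then,

$$\begin{aligned}
\#\{s^{(*)}\} = m
&\Longleftrightarrow \begin{bmatrix} |s_{O_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_1}^{(*)}) \\ \vdots \\ |s_{O_{n-m}}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_{n-m}}^{(*)}) \end{bmatrix} = 0_{(n-m)\times 1} \\
&\Longleftrightarrow A_O^T \cdot \alpha^{(*)} = A_O^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} x \\
&\qquad = A_O^T \cdot [A_N \cdot \Pi_N^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1} \cdot A_N s_N^{(*)} \\
&\qquad = A_O^T \cdot (A_N^T)^{-1} \cdot \Pi_N(s_N^{(*)}) \cdot A_N^{-1} \cdot A_N s_N^{(*)} \\
&\qquad = A_O^T \cdot (A_N^{-1})^T \cdot \begin{bmatrix} |s_{N_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_1}^{(*)}) \\ \vdots \\ |s_{N_m}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_m}^{(*)}) \end{bmatrix} = 0_{(n-m)\times 1}.
\end{aligned}$$

The proof is completed. ■

TABLE I
Supposing $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3, the convergence rate of the FOCUSS algorithm (5) can be summarized as follows.

| $p$ | $-\infty < p < 1$ | $p = 1$ | $1 < p < 2$ |
|---|---|---|---|
| $\#\{s^{(*)}\}$ | $\#\{s^{(*)}\} = m$ | $\#\{s^{(*)}\} \ge m$ | $n - m + 1 \le \#\{s^{(*)}\} \le n$ |
| Convergence rate | Superlinear | Linear | Superlinear if $\#\{s^{(*)}\} = m$; linear if $m < \#\{s^{(*)}\} \le n$ |
| Theorems and examples | Theorem 5 and Fig. 2 | Theorem 6 and Fig. 3 | Theorem 8 and Fig. 4; Theorem 9, Theorem 10, Fig. 5 and Fig. 6 |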



Remark 2. The condition (41) in Theorem 11 can be employed to generate a synthetic dataset for demonstrating Theorem 8. The detailed procedure is described in Algorithm 1.



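Algorithm 1: Generating datasets for demonstrating Theorem 8

Output: $x \in R^{m\times 1}$ and $A \in R^{m\times n}$

1) Randomly generate $A = [A_N, A_O] \in R^{m\times n}$, where $A_N \in R^{m\times m}$ and $A_O \in R^{m\times(n-m)}$.

2) Generate a vector $v \in R^{m\times 1}$ by eigenvalue decomposition (EVD) such that

$$A_O^T \cdot (A_N^{-1})^T \cdot v = 0_{(n-m)\times 1},$$

where

$$v = \begin{bmatrix} |s_{N_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_1}^{(*)}) \\ \vdots \\ |s_{N_m}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{N_m}^{(*)}) \end{bmatrix}.$$

3) Compute $s_N^{(*)} = [s_{N_1}^{(*)}, \dots, s_{N_m}^{(*)}]^T$ as follows:

$$s_{N_i}^{(*)} = |v_i|^{\frac{1}{p-1}} \cdot \operatorname{sign}(v_i), \quad i = 1, \dots, m.$$

4) Compute $x$ by $x = A s^{(*)}$, where $s^{(*)} = \begin{bmatrix} s_N^{(*)} \\ 0_{(n-m)\times 1} \end{bmatrix}$.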



By Algorithm 1, we randomly generated the datasets with the parameters $m = 15$, $n = 20$ and $\#\{s^{(*)}\} = m = 15$. We demonstrated Theorem 8 on these datasets in Fig. 4.
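The four steps of Algorithm 1 can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the function name `generate_theorem8_dataset` is hypothetical, the null-space vector in step 2 is obtained from an SVD instead of the EVD mentioned in the algorithm, and the sketch assumes $n < 2m$ so that the null space in step 2 is nontrivial.

```python
import numpy as np

def generate_theorem8_dataset(m, n, p, rng):
    """Sketch of Algorithm 1: build (A, x, s_star) with exactly m nonzeros in s_star."""
    assert 1.0 < p < 2.0 and m < n < 2 * m    # assumption: null space in step 2 is nontrivial
    # Step 1: random A = [A_N, A_O].
    A_N = rng.standard_normal((m, m))
    A_O = rng.standard_normal((m, n - m))
    # Step 2: find v with A_O^T (A_N^{-1})^T v = 0 (SVD swapped in for EVD).
    B = A_O.T @ np.linalg.inv(A_N).T           # shape (n - m, m)
    _, _, Vt = np.linalg.svd(B)                # full Vt has shape (m, m)
    null_basis = Vt[n - m:].T                  # spans null(B) when B has full row rank
    v = null_basis @ rng.standard_normal(2 * m - n)
    # Step 3: invert v_i = |s_i|^(p-1) * sign(s_i).
    s_N = np.abs(v) ** (1.0 / (p - 1.0)) * np.sign(v)
    # Step 4: assemble s_star and x = A s_star.
    s_star = np.concatenate([s_N, np.zeros(n - m)])
    A = np.hstack([A_N, A_O])
    return A, A @ s_star, s_star

rng = np.random.default_rng(0)
m, n, p = 15, 20, 1.5                          # m and n as in the text; p is an example value
A, x, s_star = generate_theorem8_dataset(m, n, p, rng)
print(np.count_nonzero(s_star))
```

The returned `s_star` has the block structure $[s_N^{(*)}; 0_{(n-m)\times 1}]$, and condition (41) can be verified directly from `A` and `s_star`.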



Appendix B 
An algorithm generating datasets for Theorem 10

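Theorem 12. For $1 < p < 2$, supposing $A$, $x$ and $s^{(0)}$ satisfy Assumption 1-Assumption 3 and the FOCUSS algorithm (5) converges to $s^{(*)}$ such that $m < \#\{s^{(*)}\} = \#\{s_N^{(*)}\} = k < n$, the sufficient and necessary condition for $s_O^{(*)} = 0_{(n-k)\times 1}$ is

$$A_O^T \cdot \alpha^{(*)} = A_O^T \cdot [A_N \cdot \Pi_N^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1} x = 0_{(n-k)\times 1}, \qquad (42)$$

where $x = A s^{(*)} = [A_N, A_O] \begin{bmatrix} s_N^{(*)} \\ s_O^{(*)} \end{bmatrix}$, and $s_O^{(*)} = [s_{O_1}^{(*)}, \dots, s_{O_{n-k}}^{(*)}]^T$ and $s_N^{(*)} = [s_{N_1}^{(*)}, \dots, s_{N_k}^{(*)}]^T$ are the zero and nonzero components of $s^{(*)}$, respectively.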

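Proof: For $1 < p < 2$, from (34) and (35), we can derive

$$\begin{bmatrix} |s_{O_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_1}^{(*)}) \\ |s_{O_2}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_2}^{(*)}) \\ \vdots \\ |s_{O_{n-k}}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_{n-k}}^{(*)}) \end{bmatrix} - A_O^T \cdot \alpha^{(*)} = 0_{(n-k)\times 1},$$

where

$$\alpha^{(*)} = [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} x = [A_N \cdot \Pi_N^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1} x \neq 0.$$

Hence, we have

$$\begin{aligned}
s_O^{(*)} = 0_{(n-k)\times 1}
&\Longleftrightarrow \begin{bmatrix} |s_{O_1}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_1}^{(*)}) \\ \vdots \\ |s_{O_{n-k}}^{(*)}|^{p-1} \cdot \operatorname{sign}(s_{O_{n-k}}^{(*)}) \end{bmatrix} = 0_{(n-k)\times 1} \\
&\Longleftrightarrow A_O^T \cdot \alpha^{(*)} = A_O^T \cdot [A \cdot \Pi^{-1}(s^{(*)}) \cdot A^T]^{-1} x \\
&\qquad = A_O^T \cdot [A_N \cdot \Pi_N^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1} x = 0_{(n-k)\times 1}.
\end{aligned}$$

■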

Remark 3. By analogy, the condition (42) in Theorem 12 also plays an essential role in generating the synthetic datasets for demonstrating Theorem 10. Based on it, the detailed procedure is given in Algorithm 2.

In Fig. 6, we demonstrated Theorem 10 on the datasets generated randomly by Algorithm 2.
Algorithm 2: Generating datasets for demonstrating Theorem 10

Output: $x \in R^{m\times 1}$ and $A \in R^{m\times n}$

1) Randomly generate $A_N \in R^{m\times k}$ and $x \in R^{m\times 1}$, where $m < k < n$.

2) Given an entrywise nonzero initialization $s_N^{(0)} \in R^{k\times 1}$, compute $s_N^{(*)}$ by the FOCUSS algorithm (5) as follows:

$$s_N^{(*)} = \mathrm{FOCUSS}(x, A_N, s_N^{(0)})$$

such that $x = A_N \cdot s_N^{(*)}$.

3) Compute $A_O$ by EVD such that

$$A_O^T \cdot [A_N \cdot \Pi_N^{-1}(s_N^{(*)}) \cdot A_N^T]^{-1} x = 0.$$

4) Compute $A = [A_N, A_O]$ and $s^{(*)} = \begin{bmatrix} s_N^{(*)} \\ 0_{(n-k)\times 1} \end{bmatrix}$, where $x = A s^{(*)}$.
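Algorithm 2 can be sketched in NumPy along the same lines. The sketch assumes the FOCUSS update $s \leftarrow \Pi^{-1}(s) A^T [A\,\Pi^{-1}(s)\,A^T]^{-1} x$ with $\Pi^{-1}(s) = \operatorname{diag}(|s_i|^{2-p})$ and a fixed iteration count. The function names are hypothetical, and step 3 uses an orthogonal projection against $\alpha^{(*)}$ instead of the EVD named in the algorithm. It is an illustration under these assumptions, not the authors' code.

```python
import numpy as np

def focuss(A, x, s0, p, n_iter=500):
    """Assumed FOCUSS iteration s <- D A^T (A D A^T)^{-1} x, D = diag(|s|^(2-p))."""
    s = s0.copy()
    for _ in range(n_iter):
        d = np.abs(s) ** (2.0 - p)
        s = d * (A.T @ np.linalg.solve((A * d) @ A.T, x))
    return s

def generate_theorem10_dataset(m, k, n, p, rng):
    """Sketch of Algorithm 2: s_star has k nonzeros with m < k < n."""
    assert 1.0 < p < 2.0 and m < k < n
    # Step 1: random A_N and x.
    A_N = rng.standard_normal((m, k))
    x = rng.standard_normal(m)
    # Step 2: run FOCUSS on (x, A_N) from an entrywise nonzero start.
    s_N = focuss(A_N, x, rng.standard_normal(k), p)
    # Step 3: columns of A_O orthogonal to alpha (projection swapped in for EVD).
    d_N = np.abs(s_N) ** (2.0 - p)
    alpha = np.linalg.solve((A_N * d_N) @ A_N.T, x)
    R = rng.standard_normal((m, n - k))
    A_O = R - np.outer(alpha, alpha @ R) / (alpha @ alpha)
    # Step 4: assemble A and s_star.
    A = np.hstack([A_N, A_O])
    s_star = np.concatenate([s_N, np.zeros(n - k)])
    return A, x, s_star, alpha

rng = np.random.default_rng(0)
m, k, n, p = 13, 15, 20, 1.5      # m and n as in Fig. 6; k and p are example values
A, x, s_star, alpha = generate_theorem10_dataset(m, k, n, p, rng)
print(np.count_nonzero(s_star))
```

Because every column of `A_O` is orthogonal to `alpha`, condition (42) holds by construction, and $x = A s^{(*)}$ follows from $x = A_N s_N^{(*)}$.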